Indexing and Retrieval of On-line Handwritten Documents
Authors
Abstract
Recent advances in on-line data capturing technologies and their widespread deployment in devices like PDAs and notebook PCs are creating large amounts of handwritten data that need to be archived and retrieved efficiently. Word-spotting, which is based on a direct comparison of a handwritten keyword to words in the document, is commonly used for indexing and retrieval. We propose a string matching-based method for word-spotting in on-line documents. The retrieval algorithm achieves a precision of 92.3% at a recall rate of 90% on a database of 6,672 words written by 10 different writers. Indexing experiments show an accuracy of 87.5% using a database of 3,872 on-line
Similar resources
Content-based Information Retrieval from Handwritten Documents
This paper is about retrieving the closest matches from a set of scanned handwritten documents based on a query that is a document image. System indexing and retrieval is based on writer characteristics, textual content as well as document metadata such as writer profile. Documents are indexed using global image features, e.g., stroke width, slant, word gaps, as well as local features that descri...
Connected Component Based Word Spotting on Persian Handwritten image documents
Word spotting makes unindexed image documents searchable by locating a word or words in a document image, given a query word. This problem is challenging, mainly due to the large number of word classes with very small inter-class and substantial intra-class distances. In this paper, a segmentation-based word spotting method is presented for multi-writer Persian handwritten doc-...
Indexing and retrieval of handwritten medical forms
POSTER PAPER. This paper proposes an approach to indexing and retrieving degraded handwritten documents. We present a modified version of the popular Vector Model in information retrieval (IR). Our model incorporates the top n candidates from a HR system into the scheme for calculating the term frequency (tf) and the inverse document frequency (idf). Standardized IR tests show that the proposed app...
Offline Automatic Segmentation based Recognition of Handwritten Arabic Words
The world heritage of handwritten Arabic documents is huge; however, only manual indexing and retrieval techniques for the content of these documents are available. To facilitate automatic retrieval of such handwritten Arabic documents, a number of automatic recognition systems for handwritten Arabic words have been proposed. Nevertheless, these systems suffer from low recognition accuracy due t...
Online Writer Identification Using Fuzzy C-means Clustering of Character Prototypes
New kinds of documents such as handwritten online documents are emerging, produced by digital devices such as Tablet PCs, personal handheld devices, or digital paper coupled with digital pens. The rapid increase in the number of such handwritten online documents leads to mounting pressure to find innovative solutions for faster processing, indexing and retrieval of the documents ...
Publication date: 2003